Cooperating Peers for Content-Oriented XML-Retrieval

Authors

  • Judith Winter
  • Oswald Drobnik
Abstract

Semi-structured documents formatted with the Extensible Markup Language (XML) are gaining wide use in a whole range of applications, including E-Commerce, E-Business, E-Science, Digital Libraries (DL), and File Sharing, and in recent years especially in applications for Peer-to-Peer (P2P) systems. P2P architectures have been identified as an efficient means of ad-hoc collaboration and information sharing among large, diverse, and dynamic sets of users. However, current P2P search engines for XML documents do not use information retrieval methods to search XML collections efficiently for relevant information. This article proposes a search engine for P2P systems that applies an extension of the vector space model and exploits structural information to compute the relevance of XML documents, and thus may significantly improve retrieval performance. We concentrate on the cooperation of peers that perform distributed query execution through cooperative retrieval and ranking of dynamic XML documents. The interaction between the participating peers is based on a structured P2P network and uses an adaptation of the DHT algorithm Kademlia.
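
As background for the abstract's closing sentence: standard Kademlia decides which peers are responsible for a key by XOR distance between identifiers. The sketch below illustrates only that standard metric; the adaptation used by the authors is not described in this abstract. The function names (`key_for`, `xor_distance`, `closest_peers`) and the choice to hash index terms and peer names with SHA-1 are assumptions made for illustration.

```python
import hashlib

ID_BITS = 160  # standard Kademlia uses 160-bit identifiers


def key_for(text: str) -> int:
    """Map a string (e.g. an index term or a peer name) to a 160-bit integer ID.

    Hashing with SHA-1 is an illustrative assumption, not taken from the paper.
    """
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: the bitwise XOR of two IDs, read as an integer."""
    return a ^ b


def closest_peers(key: int, peer_ids: list[int], k: int = 3) -> list[int]:
    """Return the k peer IDs with the smallest XOR distance to the key."""
    return sorted(peer_ids, key=lambda peer: xor_distance(peer, key))[:k]


if __name__ == "__main__":
    # Hypothetical peers and a hypothetical query term.
    peers = [key_for(f"peer-{i}") for i in range(8)]
    term_key = key_for("retrieval")
    for peer in closest_peers(term_key, peers):
        print(f"{peer:040x}")
```

In a term-partitioned index, the peers returned by `closest_peers` would be the ones asked to store or look up information about the term; whether the proposed search engine partitions its index this way is not stated here.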

Similar resources

Modelling Vague Content and Structure Querying in XML Retrieval with a Probabilistic Object-Relational Framework

Many XML retrieval applications require relevance-oriented ranking of retrieved elements in order to capture the vagueness inherent to the information retrieval process. This relevance-oriented ranking should not only support vagueness at the content level, but also at the structural level. In this paper, we use a probabilistic object-relational framework to model representation and retrieval s...

Content oriented retrieval on document centric XML

XML is the perfect format for storing (mostly) textual documents in a digital library; its flexibility enables users to store both highly structured data (like database records) and free text in the same document. The data-centric parts can be searched using query languages like XPath and XQuery, where exact conditions on the structure can be imposed. For digital libraries, however, it is impor...

XML Retrieval

DEFINITION Text documents often contain a mixture of structured and unstructured content. One way to format this mixed content is according to the adopted W3C standard for information repositories and exchanges, the eXtensible Mark-up Language (XML). In contrast to HTML, which is mainly layout-oriented, XML follows the fundamental concept of separating the logical structure of a document from i...

Evaluating the effectiveness of content-oriented XML retrieval

The INEX initiative is a collaborative effort for building an infrastructure to evaluate the effectiveness of content-oriented XML retrieval. In this paper, we show that evaluation methods developed for standard test collections must be modified in order to deal with retrieval of structured documents. Specifically, size and overlap of document components must be taken into account. For this purp...

XML Information Retrieval Systems: A Survey

The continuous growth in XML information repositories has been matched by increasing efforts in the development of XML retrieval systems, in large part aiming at supporting content-oriented XML retrieval. These systems exploit the available structural information, as marked up in XML documents, in order to return document components (the so-called XML elements) instead of the complete document...

Publication date: 2008